-
Notifications
You must be signed in to change notification settings - Fork 1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[core] Introduce binlog system table to pack the UB and UA #4520
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I feel like we can create a new system table instead of an option.
Yes, it's better. Updated. |
+------------------+----------------------+-----------------------+ | ||
| rowkind | column_0 | column_1 | | ||
+------------------+----------------------+-----------------------+ | ||
| +I | [col_0, null] | [col_1, null] | |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why not just one element?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The schema should keep same, so this is an array only with one element.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I mean one element, not one element and a null, two elements.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Get it, updated.
* | ||
* <p>DELETE: [-D, [co1, null], [col2, null]] | ||
*/ | ||
public class BinlogTable implements DataTable, ReadonlyTable { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we reuse code with AuditLogTable
? Like extending it to reuse something.
You can streaming query the binlog through binlog table. You can get the update before and update after in one row. Note: | ||
|
||
{{< hint info >}} | ||
1. Only support streaming query the binlog |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Only support streaming query the binlog.
import java.io.IOException; | ||
import java.util.function.BiFunction; | ||
|
||
/** The reader which will pack the update before and updater after message together. */ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"updater after" to "update after"
} | ||
InternalRow row2 = null; | ||
if (row1.getRowKind() == RowKind.UPDATE_BEFORE) { | ||
row1 = serializer.copy(row1); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why copy instead of using row1 directly?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The underlay's row fetched from the nextRow
could be reused.
|
||
public static final String BINLOG = "binlog"; | ||
|
||
public static final PredicateReplaceVisitor PREDICATE_CONVERTER = |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
private
|
||
@Override | ||
public RowType rowType() { | ||
List<DataField> fields = new ArrayList<>(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
init map with capacity
* | ||
* <p>DELETE: [-D, [co1, null], [col2, null]] | ||
*/ | ||
public class BinlogTable implements DataTable, ReadonlyTable { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
May be can extract some common code with the AuditLogTable system table.
@@ -162,6 +162,12 @@ under the License. | |||
<artifactId>iceberg-data</artifactId> | |||
<version>${iceberg.version}</version> | |||
<scope>test</scope> | |||
<exclusions> | |||
<exclusion> | |||
<artifactId>parquet-avro</artifactId> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why do this here?
It seems has no relation with this PR?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
See the first commit message
@@ -132,6 +132,40 @@ public void testSnapshotsTable() throws Exception { | |||
Row.of(3L, 0L, "APPEND")); | |||
} | |||
|
|||
@Test |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think add a class of BinlogTableTest in package "org.apache.paimon.table.system" may better?
Addressed the comments, pls take a look again @JingsongLi @wwj6591812 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1
Purpose
Linked issue: close #4505
Tests
API and Format
Documentation